ULTRAMETRIC SKELETONS 
Abstract. We prove that for every $\varepsilon\in(0,1)$ there exists $C_\varepsilon\in(0,\infty)$ with the following property.
If $(X,d)$ is a compact metric space and $\mu$ is a Borel probability measure on $X$ then there exists
a compact subset $S\subseteq X$ that embeds into an ultrametric space with distortion $O(1/\varepsilon)$, and a
probability measure $\nu$ supported on $S$ satisfying $\nu(B_d(x,r))\le(\mu(B_d(x,C_\varepsilon r)))^{1-\varepsilon}$ for all $x\in X$
and $r\in(0,\infty)$. The dependence of the distortion on $\varepsilon$ is sharp. We discuss an extension of this
statement to multiple measures, as well as how it implies Talagrand's majorizing measures theorem.



1. Introduction 

Our main result is the following theorem. 

Theorem 1.1. For every $\varepsilon\in(0,1)$ there exists $C_\varepsilon\in(0,\infty)$ with the following property. Let $(X,d)$
be a compact metric space and let $\mu$ be a Borel probability measure on $X$. Then there exists a
compact subset $S\subseteq X$ satisfying

(1) $S$ embeds into an ultrametric space with distortion $O(1/\varepsilon)$.

(2) There exists a Borel probability measure $\nu$ supported on $S$ satisfying

$$\nu(B_d(x,r))\le\left(\mu(B_d(x,C_\varepsilon r))\right)^{1-\varepsilon} \qquad (1)$$
for all $x\in X$ and $r\in[0,\infty)$.

Recall that an ultrametric space is a metric space $(U,\rho)$ satisfying the strengthened triangle
inequality $\rho(x,y)\le\max\{\rho(x,z),\rho(y,z)\}$ for all $x,y,z\in U$. Saying that $(S,d)$ embeds with distortion
$D\in[1,\infty)$ into an ultrametric space means that there exists an ultrametric space $(U,\rho)$ and an
injection $f:S\to U$ satisfying $d(x,y)\le\rho(f(x),f(y))\le D\,d(x,y)$ for all $x,y\in S$. In the statement
of Theorem 1.1 and in the rest of this paper, given a metric space $(X,d)$, a point $x\in X$ and a
radius $r\in[0,\infty)$, the corresponding closed ball is denoted $B_d(x,r)=\{y\in X:\ d(y,x)\le r\}$, and
the corresponding open ball is denoted $B_d^\circ(x,r)=\{y\in X:\ d(y,x)<r\}$. (We explicitly indicate
the underlying metric since the ensuing discussion involves multiple metrics on the same set.)
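For finite point sets, the two definitions above can be checked by brute force. The following is a minimal sketch, not part of the original argument: the function names `is_ultrametric` and `distortion`, the four-point example, and the candidate ultrametric are all illustrative assumptions. The function `distortion` measures the identity map between two metrics on the same finite set, allowing a rescaling of the target metric.

```python
from itertools import permutations

def is_ultrametric(points, rho, tol=1e-12):
    """Check rho(x, y) <= max(rho(x, z), rho(y, z)) over all triples of distinct points."""
    return all(
        rho(x, y) <= max(rho(x, z), rho(y, z)) + tol
        for x, y, z in permutations(points, 3)
    )

def distortion(points, d, rho):
    """Smallest D such that d <= s * rho <= D * d for some rescaling s > 0."""
    ratios = [rho(x, y) / d(x, y) for x in points for y in points if x != y]
    return max(ratios) / min(ratios)

# Hypothetical example: four points on the real line with the usual distance.
pts = [0.0, 1.0, 2.0, 3.0]
d = lambda x, y: abs(x - y)
# Candidate ultrametric: all distinct points are at distance 3 (the diameter).
rho = lambda x, y: 0.0 if x == y else 3.0

print(is_ultrametric(pts, d))    # the line metric violates the strong triangle inequality
print(is_ultrametric(pts, rho))  # the equilateral metric satisfies it
print(distortion(pts, d, rho))   # d <= rho <= 3 d on this example
```

In this example $d\le\rho\le 3d$ already holds without rescaling, so the identity map has distortion at most 3 in the sense defined above.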

We call the metric measure space $(S,d,\nu)$ from Theorem 1.1 an ultrametric skeleton of the metric
measure space $(X,d,\mu)$. The literature contains several theorems about the existence of "large"
ultrametric subsets of metric spaces; some of these results will be mentioned below. As we shall
see, the subset $S$ of Theorem 1.1 must indeed be large, but it is also geometrically "spread out"
with respect to the initial probability measure $\mu$. For example, if $\mu$ assigns positive mass to two
balls $B_d(x,r)$ and $B_d(y,r)$, where $x,y\in X$ satisfy $d(x,y)>(C_\varepsilon+1)r$, then the probability measure
$\nu$, which is supported on $S$, cannot assign full mass to any one of these balls. This is one reason
why $(S,d,\nu)$ serves as a "skeleton" of $(X,d,\mu)$.

More significantly, we call $(S,d,\nu)$ an ultrametric skeleton because it can be used to deduce
global information about the entire initial metric measure space $(X,d,\mu)$. Examples of such global
applications of statements that are implied by Theorem 1.1 are described in [13, 14], and an
additional example will be presented below. As a qualitative illustration of this phenomenon, consider
a stochastic process $\{Z_t\}_{t\in T}$, assuming for simplicity that the index set $T$ is finite and that each
random variable $Z_t$ has finite second moment. Equip $T$ with the metric $d(s,t)=\sqrt{\mathbb{E}[(Z_s-Z_t)^2]}$.
Assume that there exists a unique (random) point $\tau\in T$ satisfying $Z_\tau=\max_{t\in T}Z_t$. Let $\mu$ be
the law of $\tau$, and apply Theorem 1.1, say, with $\varepsilon=1/2$, to the metric measure space $(T,d,\mu)$.
One obtains a subset $S\subseteq T$ that embeds into an ultrametric space with distortion $O(1)$, and a
probability measure $\nu$ that is supported on $S$ and satisfies (1) (with $\varepsilon=1/2$). If $\sigma\in S$ is a random
point of $S$ whose law is $\nu$, then it follows that for every $x\in T$, $p\in(0,1)$ and $r\in[0,\infty)$, if $\sigma$ falls
into $B_d(x,r)$ with probability at least $p$, then the global maximum $\tau$ falls into $B_d(x,O(r))$ with
probability at least $p^2$. One can therefore always find a subset of $T$ that is more structured due
to the fact that it is approximately an ultrametric space (e.g., such structure can be harnessed for
chaining-type arguments), yet this subset reflects the location of the global maximum of $\{Z_t\}_{t\in T}$ in
the above distributional/geometric sense. A quantitatively sharp variant of the above qualitative
interpretation of Theorem 1.1 is discussed in Section 1.1.1 below.

1.1. Nonlinear Dvoretzky theorems. Nonlinear Dvoretzky theory, as initiated by Bourgain, 
Figiel and Milman [1] , asks for theorems asserting that any "large" metric space contains a "large" 
subset that embeds with specified distortion into Hilbert space. We will see below examples of 
notions of "largeness" of a metric space for which a nonlinear Dvoretzky theorem can be proved. For 
an explanation of the relation of such problems to the classical Dvoretzky theorem [5] , see [U [Tj E] . 
Most known nonlinear Dvoretzky theorems actually obtain subsets that admit a low distortion 
embedding into an ultrametric space. Since ultrametric spaces admit an isometric embedding into 
Hilbert space [17] , such a result falls into the Bourgain-Figiel-Milman framework. Often (see [U [T] ) 
one can prove an asymptotically matching impossibility result which shows that all subsets of a 
given metric space that admit a low distortion embedding into Hilbert space must be "small" . Thus, 
in essence, it is often the case that the best way to find an almost Hilbertian subset is actually to 
aim for a subset satisfying the seemingly more stringent requirement of being almost ultrametric. 

Apply Theorem 1.1 to an $n$-point metric space $(X,d)$, with $\mu(\{x\})=1/n$ for all $x\in X$. Since
$\nu$ is a probability measure on $S$, there exists $x\in X$ with $\nu(\{x\})\ge 1/|S|$. An application of (1)
with $r=0$ shows that $1/|S|\le\mu(\{x\})^{1-\varepsilon}=1/n^{1-\varepsilon}$, or $|S|\ge n^{1-\varepsilon}$. Since $S$ embeds into an
ultrametric space with distortion $O(1/\varepsilon)$, this shows that Theorem 1.1 implies the sharp solution of
the Bourgain-Figiel-Milman nonlinear Dvoretzky problem that was first obtained in [13]. Sharpness
in this context means that, as shown in [1], there exists a universal constant $c\in(0,\infty)$ and for
every $n\in\mathbb{N}$ there exists an $n$-point metric space $X_n$ such that every $S\subseteq X_n$ with $|S|\ge n^{1-\varepsilon}$
incurs distortion at least $c/\varepsilon$ in any embedding into Hilbert space. Thus, the distortion bound in
Theorem 1.1 cannot be improved (up to constants), even if we allow $S$ to embed into Hilbert space.

Assume that $(X,d)$ is a compact metric space of Hausdorff dimension greater than $\alpha\in(0,\infty)$.
Then there exists [9, 11] an $\alpha$-Frostman measure on $(X,d)$, i.e., a Borel probability measure $\mu$
satisfying $\mu(B_d(x,r))\le Kr^\alpha$ for every $x\in X$ and $r\in(0,\infty)$, where $K$ is a constant that may
depend on $X$ and $\alpha$ but not on $x$ and $r$. An application of Theorem 1.1 to $(X,d,\mu)$ yields a compact
subset $S\subseteq X$ that embeds into an ultrametric space with distortion $O(1/\varepsilon)$, and a Borel probability
measure $\nu$ supported on $S$ satisfying $\nu(B_d(x,r))\le\mu(B_d(x,C_\varepsilon r))^{1-\varepsilon}\le K^{1-\varepsilon}C_\varepsilon^{(1-\varepsilon)\alpha}r^{(1-\varepsilon)\alpha}$ for all
$x\in X$ and $r\in(0,\infty)$. Hence $\nu$ is a $(1-\varepsilon)\alpha$-Frostman measure on $S$, implying [11] that $S$
has Hausdorff dimension at least $(1-\varepsilon)\alpha$. Thus Theorem 1.1 implies the sharp solution of Tao's
nonlinear Dvoretzky problem for Hausdorff dimension that was first obtained in [14].

More generally, the following result was proved in [14] as the main step towards the solution of
Tao's nonlinear Dvoretzky problem for Hausdorff dimension.

Theorem 1.2. For every $\varepsilon\in(0,1)$ there exists $c_\varepsilon=e^{O(1/\varepsilon^2)}\in(0,\infty)$ with the following property.
Let $(X,d)$ be a compact metric space and let $\mu$ be a Borel probability measure on $X$. Then there
exists a compact subset $S\subseteq X$ satisfying

(1) $S$ embeds into an ultrametric space with distortion $O(1/\varepsilon)$.

(2) If $\{x_i\}_{i\in I}\subseteq X$ and $\{r_i\}_{i\in I}\subseteq[0,\infty)$ satisfy $\bigcup_{i\in I}B_d(x_i,r_i)\supseteq S$ then

$$\sum_{i\in I}\mu\left(B_d(x_i,c_\varepsilon r_i)\right)^{1-\varepsilon}\ge 1. \qquad (2)$$

Theorem 1.2 is a consequence of Theorem 1.1. Indeed, if $S\subseteq X$ and $\nu$ are the subset and
probability measure from Theorem 1.1 then $1=\nu(S)\le\sum_{i\in I}\nu(B_d(x_i,r_i))\le\sum_{i\in I}\mu(B_d(x_i,C_\varepsilon r_i))^{1-\varepsilon}$
whenever $\bigcup_{i\in I}B_d(x_i,r_i)\supseteq S$. But, Theorem 1.2 is the main reason for the validity of the
phenomenon described in Theorem 1.1; here we show how to formally deduce Theorem 1.1 from
Theorem 1.2, with $C_\varepsilon=O(c_\varepsilon/\varepsilon)=e^{O(1/\varepsilon^2)}$. Alternatively, with more work, one can repeat the
proof of Theorem 1.2 in [14] while making changes to several lemmas in order to prove Theorem 1.1
directly, and obtain $C_\varepsilon=c_\varepsilon$. Since the proof of Theorem 1.2 in [14] is quite involved, we believe
that it is instructive to establish Theorem 1.1 via the argument described here.

1.1.1. Majorizing measures and stochastic processes. Theorem 1.1 makes it possible to relate the
nonlinear Dvoretzky framework of [14] to Talagrand's nonlinear Dvoretzky theorem [16], and
consequently to Talagrand's majorizing measures theorem [16]. Given a metric space $(X,d)$ let $\mathscr{P}_X$ be
the Borel probability measures on $X$. The Fernique-Talagrand $\gamma_2$ functional is defined as follows.

$$\gamma_2(X,d)=\inf_{\mu\in\mathscr{P}_X}\sup_{x\in X}\int_0^\infty\sqrt{\log\left(\frac{1}{\mu(B_d(x,r))}\right)}\,dr. \qquad (3)$$




Talagrand's nonlinear Dvoretzky theorem [16] asserts that every finite metric space $(X,d)$ has a
subset $S\subseteq X$ that embeds into an ultrametric space with distortion $O(1)$ and${}^{1}$ $\gamma_2(S,d)\gtrsim\gamma_2(X,d)$.
Talagrand proved this nonlinear Dvoretzky theorem in order to prove his celebrated majorizing
measures theorem, which asserts that if $\{G_x\}_{x\in X}$ is a Gaussian process and for $x,y\in X$ we set
$d(x,y)=\sqrt{\mathbb{E}[(G_x-G_y)^2]}$, then $\mathbb{E}[\sup_{x\in X}G_x]\gtrsim\gamma_2(X,d)$. There is also a simpler earlier matching
upper bound due to Fernique [6], so $\mathbb{E}[\sup_{x\in X}G_x]\asymp\gamma_2(X,d)$. The fact that Talagrand's nonlinear
Dvoretzky theorem implies the majorizing measures theorem is simple; see [16, Prop. 13] and also
the discussion in [14, Sec. 1.3].

To understand the link between Theorem 1.1 and Talagrand's nonlinear Dvoretzky theorem,
consider the following quantity, associated to every compact metric space $(X,d)$.

$$\delta_2(X,d)=\sup_{\mu\in\mathscr{P}_X}\inf_{x\in X}\int_0^\infty\sqrt{\log\left(\frac{1}{\mu(B_d(x,r))}\right)}\,dr. \qquad (4)$$

Intuitively, $\gamma_2(X,d)$ should be viewed as a multi-scale version of a covering number, while $\delta_2(X,d)$
should be viewed as a multi-scale version of a packing number. It is therefore not surprising that
$\gamma_2(X,d)\asymp\delta_2(X,d)$. In fact, in Section 3 we note that $\gamma_2(U,\rho)=\delta_2(U,\rho)$ for every finite ultrametric
space $(U,\rho)$, and $\delta_2(X,d)\ge\gamma_2(X,d)$ for every finite metric space $(X,d)$ (the latter inequality is
an improvement of our original bound $\delta_2(X,d)\gtrsim\gamma_2(X,d)$, due to an elegant argument of Witold
Bednorz [2]). The remaining estimate $\delta_2(X,d)\lesssim\gamma_2(X,d)$ will not be needed here, though it follows
from our discussion (see Remark 1.3), and it also has a simpler direct proof.



${}^{1}$Here, and in what follows, the relations $\lesssim,\gtrsim$ indicate the corresponding inequalities up to factors which are
universal constants. The relation $A\asymp B$ stands for $(A\lesssim B)\wedge(A\gtrsim B)$.

Let $\mu\in\mathscr{P}_X$ satisfy $\delta_2(X,d)=\inf_{x\in X}\int_0^\infty\sqrt{\log(1/\mu(B_d(x,r)))}\,dr$. Theorem 1.1 applied to $(X,d,\mu)$
with $\varepsilon=1/2$ yields $S\subseteq X$ and an ultrametric $\rho:S\times S\to[0,\infty)$ satisfying $d(x,y)\le\rho(x,y)\le K\,d(x,y)$ for all
$x,y\in S$. Additionally, there exists $\nu\in\mathscr{P}_S$ satisfying $\nu(B_d(x,r))\le\sqrt{\mu(B_d(x,Kr))}$ for all $x\in X$
and $r\in[0,\infty)$. Here $K\in(0,\infty)$ is a universal constant. Since $B_\rho(x,r)\subseteq B_d(x,r)\cap S\subseteq B_\rho(x,Kr)$
for all $x\in S$ and $r\in[0,\infty)$, we have $\delta_2(S,d)\le\delta_2(S,\rho)=\gamma_2(S,\rho)\le K\gamma_2(S,d)$, where we used the
fact that $\gamma_2(\cdot)$ and $\delta_2(\cdot)$ coincide for ultrametrics. Hence,



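$$K\gamma_2(S,d)\ge\delta_2(S,d)\ge\inf_{x\in S}\int_0^\infty\sqrt{\log\left(\frac{1}{\nu(B_d(x,r))}\right)}\,dr\ge\inf_{x\in S}\int_0^\infty\sqrt{\log\left(\frac{1}{\sqrt{\mu(B_d(x,Kr))}}\right)}\,dr$$
$$=\frac{1}{K\sqrt{2}}\inf_{x\in S}\int_0^\infty\sqrt{\log\left(\frac{1}{\mu(B_d(x,r))}\right)}\,dr\ge\frac{\delta_2(X,d)}{K\sqrt{2}}\ge\frac{\gamma_2(X,d)}{K\sqrt{2}}. \qquad (5)$$

This completes the deduction of Talagrand's nonlinear Dvoretzky theorem from Theorem 1.1.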



Remark 1.3. It is easy to check (see [16, Lem. 6]) that $\gamma_2(S,d)\le 2\gamma_2(X,d)$ for every $S\subseteq X$
(and, even more trivially, $\delta_2(S,d)\le\delta_2(X,d)$). Thus, it follows from (5) that $\gamma_2(X,d)\gtrsim\delta_2(X,d)$
for every finite metric space $(X,d)$.

Since the original 1987 publication of Talagrand's majorizing measures theorem, this theorem 
has been reproved and simplified in several subsequent works, yielding important applications and 
generalizations (mainly due to Talagrand himself). These proofs are variants of the same basic 
idea: a greedy top-down construction, in which one looks at a given scale for a ball on which a 
certain functional is maximized, removes a neighborhood of this ball, and iterates this step on the 
remainder of the metric space. It seems that the framework described here is genuinely different. 
The proof of Theorem 1.2 in [14] has two phases. One first constructs a nested family of partitions
in a bottom-up fashion: starting with singletons one iteratively groups the points together based 
on a gluing rule that is tailor-made in anticipation of the ensuing "pruning" or "sparsification" 
step. This second step is a top-down iterative removal of appropriately "sparse" regions of the 
partitions that were constructed in the first step; here one combines an analytic argument with the 
pigeonhole principle to show that there are sufficiently many potential pruning locations so that 
successive iterations of this step can be made to align appropriately. Our new approach has the 
advantage that it yields distributional statements such as (1), the majorizing measures theorem
itself being a result of integrating these pointwise estimates. 

1.2. Multiple measures. In anticipation of further applications of ultrametric skeletons, we end 
by addressing what is perhaps the simplest question that one might ask about these geometric 
objects: to what extent is the union of two ultrametric skeletons also an ultrametric skeleton?

We show in Remark 4.3 that for arbitrarily large $D_1,D_2\in[1,\infty)$, one can find a finite metric
space $(X,d)$, and two disjoint subsets $U_1,U_2\subseteq X$, such that each $U_i$ embeds into an ultrametric
space with distortion $D_i$, yet any embedding of $U_1\cup U_2$ into an ultrametric space incurs distortion
at least $(D_1+1)(D_2+1)-1$. In Section 4 we prove the following geometric result of independent
interest (which, as explained above, is sharp up to lower order terms).

Theorem 1.4. Fix $D_1,D_2\in[1,\infty)$. Let $(X,d)$ be a metric space and $U_1,U_2\subseteq X$. Assume that
$(U_1,d)$ embeds with distortion $D_1$ into an ultrametric space and that $(U_2,d)$ embeds with distortion
$D_2$ into an ultrametric space. Then the metric space $(U_1\cup U_2,d)$ embeds with distortion at most
$(D_1+2)(D_2+2)-2$ into an ultrametric space.

Consequently, one can always find an ultrametric skeleton that is "large" with respect to any 
finite list of probability measures. 

Corollary 1.5. For every $\varepsilon\in(0,1)$ let $C_\varepsilon$ be as in Theorem 1.1. Let $(X,d)$ be a compact metric
space, and let $\mu_1,\ldots,\mu_k$ be Borel probability measures on $X$. Then there exists a compact subset
$S\subseteq X$, and Borel probability measures $\nu_1,\ldots,\nu_k$ supported on $S$, such that $S$ embeds into an
ultrametric space with distortion at most $(O(1)/\varepsilon)^k$ and for every $x\in X$ and $r\in[0,\infty)$ we have
$\nu_i(B_d(x,r))\le(\mu_i(B_d(x,C_\varepsilon r)))^{1-\varepsilon}$ for all $i\in\{1,\ldots,k\}$.

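2. Proof of Theorem 1.1
A submeasure on a set $X$ is a function $\xi:2^X\to[0,\infty)$ satisfying the following conditions.

(a) $\xi(\emptyset)=0$,

(b) $A_1\subseteq A_2\subseteq X\implies\xi(A_1)\le\xi(A_2)$,

(c) $\{A_k\}_{k=1}^\infty\subseteq 2^X\implies\xi\left(\bigcup_{k=1}^\infty A_k\right)\le\sum_{k=1}^\infty\xi(A_k)$.

If in addition $\xi(X)=1$ we call $\xi$ a probability submeasure.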

Lemma 2.1. Let $(U,\rho)$ be a compact ultrametric space, and let $\xi:2^U\to[0,\infty)$ be a probability
submeasure. Then there exists a Borel probability measure $\nu$ on $U$ satisfying $\nu(B_\rho(x,r))\le\xi(B_\rho(x,r))$
for all $x\in U$ and $r\in[0,\infty)$.

Remark 2.2. It is known [8, 15] that there exist probability submeasures that do not dominate
any nonzero measure (in the literature such measures are called pathological submeasures).
Lemma 2.1 shows that probability submeasures on ultrametric spaces always dominate on all balls
some probability measure.

Assuming the validity of Lemma 2.1 for the moment, we prove Theorem 1.1.

Proof of Theorem 1.1. Let $(X,d)$ be a compact metric space and $\mu$ a Borel probability measure on
$X$. By Theorem 1.2 there exists a compact subset $S\subseteq X$ satisfying the covering estimate (2), and
an ultrametric $\rho:S\times S\to[0,\infty)$ satisfying $d(x,y)\le\rho(x,y)\le\frac{K}{\varepsilon}d(x,y)$ for all $x,y\in S$, where $K$
is a universal constant.
For every $A\subseteq S$ define

$$\xi(A)=\inf\left\{\sum_{i\in I}\mu\left(B_d(x_i,c_\varepsilon r_i)\right)^{1-\varepsilon}:\ \{(x_i,r_i)\}_{i\in I}\subseteq X\times[0,\infty)\ \wedge\ \bigcup_{i\in I}B_d(x_i,r_i)\supseteq A\right\}. \qquad (6)$$

In (6) the index set $I$ can be countably infinite or finite, with the convention that an empty sum
vanishes. One checks that $\xi:2^S\to[0,\infty)$ is a submeasure on $S$. Moreover, for every $x\in X$ and
$r\in[0,\infty)$, by considering $B_d(x,r)$ as covering itself, we deduce from (6) that

$$\xi(S\cap B_d(x,r))\le\mu\left(B_d(x,c_\varepsilon r)\right)^{1-\varepsilon}. \qquad (7)$$

Since $\mu$ is a probability measure and $X$ has bounded diameter, it follows from (7) that $\xi(S)\le 1$.
The covering estimate (2) implies that $\xi(S)\ge 1$, so in fact $\xi$ is a probability submeasure on $S$.

An application of Lemma 2.1 to $(S,\rho)$ yields a Borel probability measure $\nu$ supported on $S$
and satisfying $\nu(B_\rho(y,r))\le\xi(B_\rho(y,r))$ for all $y\in S$ and $r\in[0,\infty)$. Fix $x\in X$ and $r\in[0,\infty)$.
The desired estimate (1) holds trivially if $B_d(x,r)\cap S=\emptyset$, so we may assume that there exists
$y\in S$ with $d(x,y)\le r$. Thus

$$S\cap B_d(x,r)\subseteq S\cap B_d(y,2r)\subseteq B_\rho\left(y,\tfrac{2K}{\varepsilon}r\right)\subseteq S\cap B_d\left(y,\tfrac{2K}{\varepsilon}r\right)\subseteq S\cap B_d\left(x,\left(1+\tfrac{2K}{\varepsilon}\right)r\right).$$

It follows that

$$\nu(B_d(x,r))\le\nu\left(B_\rho\left(y,\tfrac{2K}{\varepsilon}r\right)\right)\le\xi\left(B_\rho\left(y,\tfrac{2K}{\varepsilon}r\right)\right)\le\xi\left(S\cap B_d\left(x,\left(1+\tfrac{2K}{\varepsilon}\right)r\right)\right)\le\mu\left(B_d\left(x,\left(1+\tfrac{2K}{\varepsilon}\right)c_\varepsilon r\right)\right)^{1-\varepsilon},$$

where the last step uses (7).
This completes the deduction of Theorem 1.1 from Theorem 1.2 and Lemma 2.1. □
Prior to proving Lemma 2.1 we review some basic facts about compact ultrametric spaces;
see [10] for an extended and more general treatment of this topic. Fix a compact ultrametric space
$(U,\rho)$. For every $r\in(0,\infty)$ we have $|\{B_\rho(x,s):\ (x,s)\in U\times[r,\infty)\}|<\infty$, i.e., there are only
finitely many closed balls in $U$ of radius at least $r$. Indeed, by compactness $U$ contains only finitely
many disjoint closed balls of radius at least $r$. Since $B_\rho(x,s)\cap B_\rho(y,t)\in\{\emptyset,B_\rho(x,s),B_\rho(y,t)\}$ for
every $x,y\in U$ and $s,t\in[0,\infty)$, assuming for contradiction that $\{B_\rho(x,s):\ (x,s)\in U\times[r,\infty)\}$ is
infinite, we deduce that there exist $\{(x_i,s_i)\}_{i=1}^\infty\subseteq U\times[r,\infty)$ satisfying $B_\rho(x_i,s_i)\subsetneq B_\rho(x_{i+1},s_{i+1})$
for all $i\in\mathbb{N}$. Fix $y_i\in B_\rho(x_{i+1},s_{i+1})\setminus B_\rho(x_i,s_i)$. If $i<j$ then $y_j\notin B_\rho(x_j,s_j)\supseteq B_\rho(x_{i+1},s_{i+1})$ and
$y_i\in B_\rho(x_{i+1},s_{i+1})$. Hence $s_{i+1}<\rho(y_j,x_{i+1})\le\max\{\rho(y_j,y_i),\rho(y_i,x_{i+1})\}\le\max\{\rho(y_i,y_j),s_{i+1}\}$.
It follows that $\rho(y_i,y_j)>s_{i+1}\ge r$ for all $j>i$, contradicting the compactness of $(U,\rho)$.

A consequence of the above discussion is that for every $x\in U$ and $r\in(0,\infty)$ there exists
$\varepsilon\in(0,\infty)$ such that $B_\rho(x,r)=B_\rho(x,r+\varepsilon)$. Therefore $B_\rho(x,r)=B_\rho^\circ(x,r+\varepsilon/2)$. Similarly, since
$B_\rho^\circ(x,r)=\bigcup_{\delta\in(0,r/2]}B_\rho(x,r-\delta)$, where there are only finitely many distinct balls appearing in this
union, there exists $\delta\in(0,r/2]$ such that $B_\rho^\circ(x,r)=B_\rho(x,r-\delta)$. Thus every open ball in $U$ of
positive radius is also a closed ball, and every closed ball in $U$ of positive radius is also an open
ball. Consider the equivalence relation on $U$ given by $x\sim y\iff\rho(x,y)<\operatorname{diam}_\rho(U)$. This is
indeed an equivalence relation since $\rho$ is an ultrametric. The corresponding equivalence classes are
all of the form $B_\rho^\circ(x,\operatorname{diam}_\rho(U))$ for some $x\in U$. Being open sets that cover $U$, there are only
finitely many such equivalence classes, say, $\{B_1^1,B_2^1,\ldots,B_{k_1}^1\}$. By the above discussion, each of
the open balls $B_i^1$ is also a closed ball, and hence $(B_i^1,\rho)$ is a compact ultrametric space. We can
therefore continue the above construction iteratively, obtaining a sequence $\{P_j\}_{j=0}^\infty$ of partitions of
$U$ with the following properties.

(1) Po = {U}. 

(2) Pj is finite for all j. 

(3) Pj+i is a refinement of Pj for all j. 

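(4) Every $C\in P_j$ is of the form $B_\rho^\circ(x,r)$ for some $x\in U$ and $r\in[0,\infty)$.

(5) For every $j$, if $C\in P_j$ is not a singleton then there exist $x_1,\ldots,x_k\in U$ such that
$\{B_\rho^\circ(x_i,\operatorname{diam}_\rho(C))\}_{i=1}^k\subseteq P_{j+1}$, the open balls $\{B_\rho^\circ(x_i,\operatorname{diam}_\rho(C))\}_{i=1}^k$ are disjoint, and
$C=\bigcup_{i=1}^k B_\rho^\circ(x_i,\operatorname{diam}_\rho(C))$.

(6) $\lim_{j\to\infty}\max_{C\in P_j}\operatorname{diam}_\rho(C)=0$.

(7) For every $x\in U$ and $r\in(0,\infty)$ there exists $j$ such that $B_\rho^\circ(x,r)\in P_j$.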

The first five items above are valid by construction. The sixth item follows from the fact that
for all $j\in\mathbb{N}$ either $P_{j-1}$ consists of singletons or $\max_{C\in P_j}\operatorname{diam}_\rho(C)<\max_{C\in P_{j-1}}\operatorname{diam}_\rho(C)$.
Since for every $r\in(0,\infty)$ there are only finitely many balls of radius at least $r$ in $U$, necessarily
$\lim_{j\to\infty}\max_{C\in P_j}\operatorname{diam}_\rho(C)=0$. To prove the seventh item above, assume for contradiction that
$(x,r)\in U\times(0,\infty)$ is such that $B_\rho^\circ(x,r)\notin P_j$ for all $j$. Since the set $\{B_\rho^\circ(x,s)\}_{s\ge r}$ is finite, and
$B_\rho^\circ(x,\operatorname{diam}_\rho(U)+1)=U\in P_0$, we may assume without loss of generality that $B_\rho^\circ(x,s)\in\bigcup_{j=0}^\infty P_j$
for all $s\in(r,\infty)$. But since $B_\rho(x,r)=B_\rho^\circ(x,s)$ for some $s\in(r,\infty)$, it follows that $B_\rho(x,r)\in P_j$
for some $j$. In particular, $B_\rho^\circ(x,r)\ne B_\rho(x,r)$, implying that $\operatorname{diam}_\rho(B_\rho(x,r))=r$. Therefore by
construction $B_\rho^\circ(x,r)\in P_{j+1}$, a contradiction.

Proof of Lemma 2.1. Let $\{P_j\}_{j=0}^\infty$ be the sequence of partitions of $U$ that was constructed above.
We will first define $\nu$ on $\mathcal{S}=\bigcup_{j=0}^\infty P_j\cup\{\emptyset\}$, which is the set of all open balls in $U$ (allowing the
radius to vanish, in which case the corresponding open ball is empty). Setting $\nu(U)=1$ and
$\nu(\emptyset)=0$, assume inductively that $\nu$ has been defined on $P_j$. For $C\in P_{j+1}$ let $D\in P_j$ be the
unique set satisfying $C\subseteq D$. There exist disjoint sets $C_1,\ldots,C_k\in P_{j+1}$, with $C\in\{C_1,\ldots,C_k\}$,
such that $D=C_1\cup\cdots\cup C_k$. Define

$$\nu(C)=\frac{\xi(C)}{\sum_{i=1}^k\xi(C_i)}\,\nu(D). \qquad (8)$$

This completes the inductive definition of $\nu:\mathcal{S}\to[0,\infty)$.

We claim that one can apply the Caratheodory extension theorem to extend $\nu$ to a Borel measure
on $U$. To this end, note that $\mathcal{S}$ is a semi-ring of sets. Indeed, $\mathcal{S}$ is closed under intersection since
$C\cap D\in\{\emptyset,C,D\}$ for all $C,D\in\mathcal{S}$. We therefore need to check that for every $C,D\in\mathcal{S}$, the set $D\setminus C$
is a finite disjoint union of elements in $\mathcal{S}$. For this purpose we may assume that $C\cap D\ne\emptyset$ (otherwise
$D\setminus C=D\in\mathcal{S}$) and that $D\setminus C\ne\emptyset$, implying
that $C\subsetneq D$. Assume that $C\in P_j$ and $D\in P_i$ for $i<j$. Let $C_1,\ldots,C_k\in P_j$ be the distinct
elements of $P_j$ that are contained in $D$, enumerated so that $C=C_1$. Then $D\setminus C=C_2\cup\cdots\cup C_k$,
and this union is disjoint, as required.

In order to apply the Caratheodory extension theorem, it remains to check that if $\{A_i\}_{i=1}^\infty\subseteq\mathcal{S}$
are pairwise disjoint and $\bigcup_{i=1}^\infty A_i\in\mathcal{S}$, then $\nu\left(\bigcup_{i=1}^\infty A_i\right)=\sum_{i=1}^\infty\nu(A_i)$. Since all the elements of
$\mathcal{S}$ are both open and closed, compactness implies that it suffices to show that if $A_1,\ldots,A_m\in\mathcal{S}$
are pairwise disjoint and $A_1\cup\cdots\cup A_m\in\mathcal{S}$, then $\nu(A_1\cup\cdots\cup A_m)=\sum_{i=1}^m\nu(A_i)$. We proceed by
induction on $m$, the case $m=1$ being vacuous. For every $i\in\{1,\ldots,m\}$ there is a unique $k_i\in\mathbb{N}$
such that $A_i\in P_{k_i-1}\setminus P_{k_i-2}$ (with the convention $P_{-1}=\emptyset$). Define $k=\max\{k_1,\ldots,k_m\}$. If $k=1$ then necessarily $m=1$ and
$A_1=U$. Assume that $k>1$ and fix $j\in\{1,\ldots,m\}$ satisfying $k_j=k$. Let $D\in P_{k-2}$ be the unique
element of $P_{k-2}$ containing $A_j$, and let $C_1,\ldots,C_\ell\in P_{k-1}$ be the distinct elements of $P_{k-1}$ contained
in $D$. Since $A_1\cup\cdots\cup A_m$ is a ball containing $A_j\subsetneq D$, we have $A_1\cup\cdots\cup A_m\supseteq D=C_1\cup\cdots\cup C_\ell$.
By maximality of $k$ it follows that $A_j\in\{C_1,\ldots,C_\ell\}\subseteq\{A_1,\ldots,A_m\}$. For $i\in\{1,\ldots,\ell\}$ let
$n_i\in\{1,\ldots,m\}$ be such that $C_i=A_{n_i}$. Since $\left(\bigcup_{i\in\{1,\ldots,m\}\setminus\{n_1,\ldots,n_\ell\}}A_i\right)\cup D=\bigcup_{i=1}^m A_i$, the
inductive hypothesis implies that $\sum_{i\in\{1,\ldots,m\}\setminus\{n_1,\ldots,n_\ell\}}\nu(A_i)+\nu(D)=\nu(A_1\cup\cdots\cup A_m)$. But by our
definition (8) we have $\nu(D)=\nu(A_{n_1})+\cdots+\nu(A_{n_\ell})$, so that indeed $\nu(A_1\cup\cdots\cup A_m)=\sum_{i=1}^m\nu(A_i)$.

Having defined the Borel probability measure $\nu$, it remains to check by induction on $j$ that if
$C\in P_j$ then $\nu(C)\le\xi(C)$. If $j=0$ then $C=U$ and $\nu(U)=\xi(U)=1$. If $j\ge 1$ then let
$D\in P_{j-1}$ satisfy $C\subseteq D$. There exist disjoint sets $C_1,\ldots,C_k\in P_j$, with $C\in\{C_1,\ldots,C_k\}$,
such that $D=C_1\cup\cdots\cup C_k$. Since $\xi$ is a submeasure, $\xi(D)\le\xi(C_1)+\cdots+\xi(C_k)$. By the
inductive hypothesis $\nu(D)\le\xi(D)$. Our definition (8) now implies that $\nu(C)\le\xi(C)$. The proof of
Lemma 2.1 is complete. □
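For a finite ultrametric space the construction in the proof of Lemma 2.1 can be carried out explicitly. The sketch below is only an illustration under stated assumptions: the helper names (`diameter`, `children`, `dominated_measure`), the four-point space, and the sample submeasure $\xi(A)=\sqrt{|A|/|U|}$ are hypothetical choices, not taken from the paper. Each non-singleton block is split into its open balls of radius equal to its diameter, and mass is passed down by rule (8).

```python
import math

def diameter(block, rho):
    return max(rho[x][y] for x in block for y in block)

def children(block, rho):
    """Split a non-singleton block into its open balls of radius diam(block)."""
    diam = diameter(block, rho)
    parts = []
    for x in block:
        for part in parts:
            if rho[x][part[0]] < diam:  # x ~ y iff rho(x, y) < diam(block)
                part.append(x)
                break
        else:
            parts.append([x])
    return parts

def dominated_measure(points, rho, xi):
    """Point masses of nu, assigned top-down by nu(C) = nu(D) * xi(C) / sum_i xi(C_i)."""
    nu = {}
    stack = [(list(points), 1.0)]  # nu(U) = 1
    while stack:
        block, mass = stack.pop()
        if len(block) == 1:
            nu[block[0]] = mass
            continue
        parts = children(block, rho)
        total = sum(xi(p) for p in parts)
        for p in parts:
            # If every xi(C_i) vanishes then nu(D) <= xi(D) = 0, so the children get mass 0.
            share = xi(p) / total if total > 0 else 0.0
            stack.append((p, mass * share))
    return nu

# Hypothetical example: three points at mutual distance 1, a fourth at distance 2.
points = [0, 1, 2, 3]
rho = [[0, 1, 1, 2],
       [1, 0, 1, 2],
       [1, 1, 0, 2],
       [2, 2, 2, 0]]
xi = lambda block: math.sqrt(len(block) / len(points))  # a sample probability submeasure
nu = dominated_measure(points, rho, xi)
print(nu)
```

On this example every ball of $U$ is one of the blocks produced by `children`, so the inequality $\nu(B_\rho(x,r))\le\xi(B_\rho(x,r))$ can be verified ball by ball.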

3. $\gamma_2(X,d)$ AND $\delta_2(X,d)$

Let $(X,d)$ be a finite metric space. For every measurable $\phi:(0,\infty)\to[0,\infty)$ define

$$\gamma_\phi(X,d)=\inf_{\mu\in\mathscr{P}_X}\sup_{x\in X}\int_0^\infty\phi\left(\mu(B_d(x,r))\right)dr,$$

and

$$\delta_\phi(X,d)=\sup_{\mu\in\mathscr{P}_X}\inf_{x\in X}\int_0^\infty\phi\left(\mu(B_d(x,r))\right)dr.$$

Thus $\gamma_2(\cdot)=\gamma_\phi(\cdot)$ and $\delta_2(\cdot)=\delta_\phi(\cdot)$ for $\phi(x)=\sqrt{\log(1/x)}$. The following lemma is a variant of an
argument of Bednorz [2, Lem. 4]. The elegant proof below was shown to us by Keith Ball; it is a
generalization and a major simplification of our original proof of the estimate $\delta_2(X,d)\gtrsim\gamma_2(X,d)$.

Lemma 3.1. Assume that $\phi:(0,\infty)\to[0,\infty)$ is continuous and $\lim_{x\to 0^+}\phi(x)=\infty$. Then
$\delta_\phi(X,d)\ge\gamma_\phi(X,d)$.

Proof. Write $X=\{x_1,\ldots,x_n\}$. Thus $\mathscr{P}_X$ can be identified with the $(n-1)$-dimensional simplex
$\Delta_{n-1}=\{(\mu_1,\ldots,\mu_n)\in[0,1]^n:\ \mu_1+\cdots+\mu_n=1\}$ (by setting $\mu(\{x_i\})=\mu_i$).
Define $f_1,\ldots,f_n:\Delta_{n-1}\to[0,\infty)$ by

$$f_i(\mu)=\begin{cases}0 & \text{if }\mu_i=0,\\ \left(1+\int_0^\infty\phi\left(\mu(B_d(x_i,r))\right)dr\right)^{-1} & \text{if }\mu_i>0.\end{cases}$$

Writing $S(\mu)=\sum_{i=1}^n f_i(\mu)$, we define $F:\Delta_{n-1}\to\Delta_{n-1}$ by $F(\mu)=(f_1(\mu),\ldots,f_n(\mu))/S(\mu)$. Since $\phi$
is continuous and $\lim_{x\to 0^+}\phi(x)=\infty$, all the $f_i$ are continuous on $\Delta_{n-1}$. Since each $\mu\in\Delta_{n-1}$ has
at least one positive coordinate, $S(\mu)>0$. Thus $F$ is continuous. Note that by definition $F$ maps
each face of $\Delta_{n-1}$ into itself. By a standard reformulation of the Brouwer fixed point theorem (see,
e.g., [7, Sec. 4.29$\frac12_+$]), it follows that $F(\Delta_{n-1})=\Delta_{n-1}$. In particular, there exists $\mu\in\Delta_{n-1}$ for
which $F(\mu)=(1/n,\ldots,1/n)$. In other words, there exists $\mu\in\mathscr{P}_X$ such that $\int_0^\infty\phi(\mu(B_d(x,r)))\,dr$
does not depend on $x\in X$. Hence,

$$\delta_\phi(X,d)\ge\inf_{x\in X}\int_0^\infty\phi\left(\mu(B_d(x,r))\right)dr=\sup_{x\in X}\int_0^\infty\phi\left(\mu(B_d(x,r))\right)dr\ge\gamma_\phi(X,d). \qquad\square$$

Lemma 3.2. Assume that $\phi:(0,\infty)\to[0,\infty)$ is non-increasing. Let $(U,\rho)$ be a finite ultrametric
space. Then $\delta_\phi(U,\rho)\le\gamma_\phi(U,\rho)$.

Proof. We claim that if $\mu,\nu$ are nonnegative measures on $U$ satisfying $\mu(U)\le\nu(U)$ then there
exists $a\in U$ satisfying $\mu(B_\rho(a,r))\le\nu(B_\rho(a,r))$ for all $r\in(0,\infty)$. This would imply the
desired estimate since if $\mu,\nu\in\mathscr{P}_U$ are chosen so that $\sup_{x\in U}\int_0^\infty\phi(\mu(B_\rho(x,r)))\,dr=\gamma_\phi(U,\rho)$ and
$\inf_{x\in U}\int_0^\infty\phi(\nu(B_\rho(x,r)))\,dr=\delta_\phi(U,\rho)$, then

$$\gamma_\phi(U,\rho)\ge\int_0^\infty\phi\left(\mu(B_\rho(a,r))\right)dr\ge\int_0^\infty\phi\left(\nu(B_\rho(a,r))\right)dr\ge\delta_\phi(U,\rho).$$

The proof of the existence of $a\in U$ is by induction on $|U|$. If $|U|=1$ there is nothing
to prove. Otherwise, as explained in Section 2, there exist $x_1,\ldots,x_k\in U$ such that the balls
$\{B_\rho^\circ(x_i,\operatorname{diam}_\rho(U))\}_{i=1}^k$ are nonempty, pairwise disjoint, and $\bigcup_{i=1}^k B_\rho^\circ(x_i,\operatorname{diam}_\rho(U))=U$. It follows
that $\sum_{i=1}^k\mu(B_\rho^\circ(x_i,\operatorname{diam}_\rho(U)))=\mu(U)\le\nu(U)=\sum_{i=1}^k\nu(B_\rho^\circ(x_i,\operatorname{diam}_\rho(U)))$. Consequently there
exists $i\in\{1,\ldots,k\}$ such that $\mu(B_\rho^\circ(x_i,\operatorname{diam}_\rho(U)))\le\nu(B_\rho^\circ(x_i,\operatorname{diam}_\rho(U)))$. By the inductive
hypothesis there exists $a\in B_\rho^\circ(x_i,\operatorname{diam}_\rho(U))$ satisfying $\mu(B_\rho(a,r))\le\nu(B_\rho(a,r))$ for all $r<\operatorname{diam}_\rho(U)$.
Since for $r\ge\operatorname{diam}_\rho(U)$ we have $B_\rho(a,r)=U$, the proof is complete. □
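The induction in the proof of Lemma 3.2 is effectively a recursive search, which can be sketched as follows for a finite ultrametric space given by a distance matrix. The names `open_ball_parts` and `dominated_point` and the four-point example are illustrative assumptions; exact rational arithmetic is used so that the comparisons between masses are not affected by rounding.

```python
from fractions import Fraction

def open_ball_parts(block, rho):
    """Partition a non-singleton block into its open balls of radius diam(block)."""
    diam = max(rho[x][y] for x in block for y in block)
    parts = []
    for x in block:
        for part in parts:
            if rho[x][part[0]] < diam:
                part.append(x)
                break
        else:
            parts.append([x])
    return parts

def dominated_point(block, rho, mu, nu):
    """Assuming mu(block) <= nu(block), return a point a of block such that
    mu(B(a, r)) <= nu(B(a, r)) for every ball B(a, r) contained in block."""
    if len(block) == 1:
        return block[0]
    for part in open_ball_parts(block, rho):
        # Some part must satisfy mu(part) <= nu(part), since the parts partition the block.
        if sum(mu[x] for x in part) <= sum(nu[x] for x in part):
            return dominated_point(part, rho, mu, nu)
    raise ValueError("requires mu(block) <= nu(block)")

# Hypothetical example: a three-level ultrametric on four points.
points = [0, 1, 2, 3]
rho = [[0, 1, 2, 4],
       [1, 0, 2, 4],
       [2, 2, 0, 4],
       [4, 4, 4, 0]]
mu = [Fraction(1, 2), Fraction(1, 8), Fraction(1, 8), Fraction(1, 4)]
nu = [Fraction(1, 4)] * 4
a = dominated_point(points, rho, mu, nu)
print(a)  # → 2
```

Here the balls centered at the returned point are $\{2\}$, $\{0,1,2\}$ and $U$, with $\mu$-masses $1/8$, $3/4$, $1$ and $\nu$-masses $1/4$, $3/4$, $1$.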

A combination of Lemma 3.1 and Lemma 3.2 yields the following corollary.

Corollary 3.3. If $\phi:(0,\infty)\to[0,\infty)$ is non-increasing, continuous, and $\lim_{x\to 0^+}\phi(x)=\infty$, then
$\delta_\phi(U,\rho)=\gamma_\phi(U,\rho)$ for all finite ultrametric spaces $(U,\rho)$.

Remark 3.4. Consider the star metric $d_n$ on $\{0,1,\ldots,n\}$, i.e., $d_n(0,i)=1$ for all $i\in\{1,\ldots,n\}$ and
$d_n(p,q)=2$ for all distinct $p,q\in\{1,\ldots,n\}$. The measure $\nu$ on $\{0,1,\ldots,n\}$ given by $\nu(\{0\})=0$
and $\nu(\{i\})=1/n$ for $i\in\{1,\ldots,n\}$, shows that $\delta_2(\{0,1,\ldots,n\},d_n)\ge 2\sqrt{\log n}$. At the same time,
the measure $\mu$ on $\{0,1,\ldots,n\}$ given by $\mu(\{0\})=1/2$ and $\mu(\{i\})=1/(2n)$ for $i\in\{1,\ldots,n\}$, shows
that $\gamma_2(\{0,1,\ldots,n\},d_n)\le\sqrt{\log(2n)}+\sqrt{\log(2n/(n+1))}\le(1/2+o(1))\,\delta_2(\{0,1,\ldots,n\},d_n)$. Thus,
unlike the case of ultrametric spaces, for general metric spaces it is not always true that $\gamma_2(X,d)=
\delta_2(X,d)$. Of course, due to Lemma 3.1 and Remark 1.3 we know that $\gamma_2(X,d)\asymp\delta_2(X,d)$. It seems
plausible that always $\delta_2(X,d)\le 2\gamma_2(X,d)$, but we do not investigate this here.
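The two bounds in Remark 3.4 can be checked numerically. The following sketch is an illustration only (the helper names `ball_integral` and `star`, and the choice $n=100$, are assumptions): on a finite space the function $r\mapsto\mu(B_d(x,r))$ is piecewise constant, so the integral reduces to a finite sum over the distinct distances from $x$.

```python
import math

def ball_integral(x, points, dist, measure):
    """Integral over r in [0, inf) of sqrt(log(1 / measure(B_d(x, r)))),
    for a probability measure on a finite metric space."""
    radii = sorted({dist(x, y) for y in points})  # the closed ball only changes at these radii
    total = 0.0
    for lo, hi in zip(radii, radii[1:]):
        mass = sum(measure[y] for y in points if dist(x, y) <= lo)
        if mass == 0:
            return math.inf
        total += (hi - lo) * math.sqrt(max(0.0, math.log(1.0 / mass)))
    return total  # beyond the largest radius the ball has mass 1 and contributes 0

def star(p, q):
    """Star metric: the center 0 is at distance 1 from each leaf, leaves are at distance 2."""
    if p == q:
        return 0
    return 1 if 0 in (p, q) else 2

n = 100
points = list(range(n + 1))
nu = {0: 0.0, **{i: 1.0 / n for i in range(1, n + 1)}}
mu = {0: 0.5, **{i: 1.0 / (2 * n) for i in range(1, n + 1)}}

lower = min(ball_integral(x, points, star, nu) for x in points)  # delta_2 >= lower
upper = max(ball_integral(x, points, star, mu) for x in points)  # gamma_2 <= upper
print(lower, upper)
```

The value `lower` should agree with $2\sqrt{\log n}$ and `upper` with $\sqrt{\log(2n)}+\sqrt{\log(2n/(n+1))}$, so already for $n=100$ the upper bound on $\gamma_2$ is strictly smaller than the lower bound on $\delta_2$.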

4. Unions of approximate ultrametrics 

In this section we prove Theorem 1.4 and present some related examples. Below, given a partition
$P$ of a set $X$, for $x\in X$ we denote by $P(x)$ the element of $P$ to which $x$ belongs.
Lemma 4.1. Fix $D_1,D_2\ge 1$. Let $(X,d)$ be a metric space and let $U_1,U_2\subseteq X$ be two bounded
subsets of $X$. Assume that $(U_1,d)$ embeds with distortion $D_1$ into an ultrametric space and that
$(U_2,d)$ embeds with distortion $D_2$ into an ultrametric space. Then for every $\varepsilon\in(0,1)$ there is a
partition $P$ of $U_1\cup U_2$ with the following properties.

• For every $C\in P$,

$$\operatorname{diam}_d(C)\le(1-\delta)\operatorname{diam}_d(U_1\cup U_2), \qquad (9)$$

where

$$\delta\stackrel{\mathrm{def}}{=}\frac{2\varepsilon D_2}{(D_1D_2+2D_1+2D_2+2)(D_1D_2+2D_1+2D_2+2+\varepsilon)}. \qquad (10)$$

• For every distinct $C_1,C_2\in P$,

$$d(C_1,C_2)\ge\frac{\operatorname{diam}_d(U_1\cup U_2)}{D_1D_2+2D_1+2D_2+2+\varepsilon}. \qquad (11)$$

Proof. By rescaling we may assume that $\operatorname{diam}_d(U_1\cup U_2)=1$. Let $\rho_1$ be an ultrametric on $U_1$
satisfying $d(x,y)\le\rho_1(x,y)\le D_1d(x,y)$ for all $x,y\in U_1$. Define

$$a\stackrel{\mathrm{def}}{=}\frac{D_1D_2+2D_1}{D_1D_2+2D_1+2D_2+2}, \qquad (12)$$

and consider the equivalence relation on $U_1$ given by $x\sim_1 y\iff\rho_1(x,y)\le a$ (this is an equivalence
relation since $\rho_1$ is an ultrametric). Let $\{E_i\}_{i\in I}\subseteq 2^{U_1}$ be the corresponding equivalence classes.
Thus

$$\operatorname{diam}_d(E_i)\le\operatorname{diam}_{\rho_1}(E_i)\le a \qquad (13)$$
for all $i\in I$, and for distinct $i,j\in I$ we have

$$d(E_i,E_j)\ge\frac{a}{D_1}. \qquad (14)$$

Let $\rho_2$ be an ultrametric on $U_2$ satisfying $d(x,y)\le\rho_2(x,y)\le D_2d(x,y)$ for all $x,y\in U_2$. Define

$$b\stackrel{\mathrm{def}}{=}\frac{D_2}{D_1D_2+2D_1+2D_2+2+\varepsilon}, \qquad (15)$$

and consider similarly the equivalence relation on $U_2$ given by $x\sim_2 y\iff\rho_2(x,y)\le b$. The
corresponding equivalence classes will be denoted $\{F_j\}_{j\in J}\subseteq 2^{U_2}$. Thus

$$\operatorname{diam}_d(F_i)\le\operatorname{diam}_{\rho_2}(F_i)\le b \qquad (16)$$

for all $i\in J$, and for distinct $i,j\in J$ we have

$$d(F_i,F_j)\ge\frac{b}{D_2}. \qquad (17)$$

For every $i\in I$ denote

$$J_i=\{j\in J:\ d(E_i,F_j)\le c\}, \qquad (18)$$

where

$$c\stackrel{\mathrm{def}}{=}\frac{1}{D_1D_2+2D_1+2D_2+2}. \qquad (19)$$
Note that for every $j\in J$ there is at most one $i\in I$ for which $j\in J_i$. Indeed, if $j\in J_i\cap J_\ell$, where
$i\ne\ell$, then

$$d(E_i,E_\ell)\le d(E_i,F_j)+\operatorname{diam}_d(F_j)+d(F_j,E_\ell)\le 2c+b$$
$$=\frac{2}{D_1D_2+2D_1+2D_2+2}+\frac{D_2}{D_1D_2+2D_1+2D_2+2+\varepsilon}<\frac{2+D_2}{D_1D_2+2D_1+2D_2+2}=\frac{a}{D_1},$$

where we used (18), (16), (15) and (19) for the first two lines and (12) for the last identity.
Contradicting (14).

Consider the partition $P$ of $U_1\cup U_2$ consisting of the sets

$$\Big\{E_i\cup\bigcup_{j\in J_i}F_j\Big\}_{i\in I}\quad\text{and}\quad\{F_j\setminus U_1\}_{j\in J\setminus\left(\bigcup_{i\in I}J_i\right)}.$$

It follows from (14), (17) and (18) that for every distinct $C_1,C_2\in P$,

$$d(C_1,C_2)\ge\min\left\{\frac{a}{D_1},\frac{b}{D_2},c\right\}=\frac{1}{D_1D_2+2D_1+2D_2+2+\varepsilon}.$$

Thus the partition $P$ satisfies (11). It also follows from (13), (16) and (18) that for every $C\in P$,

$$\operatorname{diam}_d(C)\le a+2b+2c=1-\delta,$$

where we used the definitions (10), (12), (15), (19). Thus the partition $P$ satisfies (9), completing
the proof of Lemma 4.1. □
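The proof of Lemma 4.1 rests on three elementary identities among the constants: $a+2b+2c=1-\delta$, the strict inequality $2c+b<a/D_1$, and $\min\{a/D_1,\,b/D_2,\,c\}=1/(D_1D_2+2D_1+2D_2+2+\varepsilon)$. The following sketch checks them in exact rational arithmetic; the function name `lemma_constants` and the sample values of $D_1,D_2,\varepsilon$ are illustrative assumptions, with the formulas for $a,b,c,\delta$ taken from (12), (15), (19) and (10).

```python
from fractions import Fraction

def lemma_constants(D1, D2, eps):
    """Constants a, b, c, delta from (12), (15), (19), (10), after rescaling to diameter 1."""
    M = D1 * D2 + 2 * D1 + 2 * D2 + 2
    a = (D1 * D2 + 2 * D1) / M
    b = D2 / (M + eps)
    c = 1 / M
    delta = 2 * eps * D2 / (M * (M + eps))
    return a, b, c, delta

# Hypothetical sample parameters (any D1, D2 >= 1 and eps in (0, 1) would do).
D1, D2, eps = Fraction(3), Fraction(5, 2), Fraction(1, 10)
a, b, c, delta = lemma_constants(D1, D2, eps)
M = D1 * D2 + 2 * D1 + 2 * D2 + 2

print(a + 2 * b + 2 * c == 1 - delta)             # diameter bound (9)
print(2 * c + b < a / D1)                         # uniqueness of i with j in J_i
print(min(a / D1, b / D2, c) == 1 / (M + eps))    # separation bound (11)
```

Using `Fraction` rather than floating point keeps the equalities exact, so the comparisons test the algebra and not rounding behavior.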

Proof of Theorem \l-4\ Assume first that U\,U2 are bounded. Define a sequence {Pk}tLo of par- 
titions of C7i U U2 as follows. Start with the trivial partition Pq = {U\ U U2}, and having defined 
Pfc, the partition Pk+i is obtained by applying Lemma [4TT1 to the sets U% D C and £/ 2 n C for each 
C G Pk- Then the partitions {Pfcj^o nave t ne following properties. 

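• $P_{k+1}$ is a refinement of $P_k$,

• for every $C \in P_k$ we have

\[ \operatorname{diam}_d(C) \le (1 - \delta)^k \operatorname{diam}_d(U_1 \cup U_2), \tag{20} \]

• for every distinct $C_1, C_2 \in P_{k+1}$ such that $C_1, C_2 \subseteq C$ for some $C \in P_k$, we have

\[ d(C_1, C_2) \ge \frac{\operatorname{diam}_d(C)}{D_1 D_2 + 2D_1 + 2D_2 + 2 + \varepsilon}. \tag{21} \]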

It follows from (20) that for every distinct $x, y \in U_1 \cup U_2$ we have $P_k(x) \neq P_k(y)$ for $k \ge 0$ large
enough. Thus for distinct $x, y \in U_1 \cup U_2$ let $k(x,y)$ denote the largest integer $k \ge 0$ such that
$P_k(x) = P_k(y)$. Define

\[ \rho(x,y) = \begin{cases} \operatorname{diam}_d\left(P_{k(x,y)}(x)\right) & x \neq y, \\ 0 & x = y. \end{cases} \]



Then $\rho$ is an ultrametric on $U_1 \cup U_2$. Indeed, for distinct $x, y, z \in U_1 \cup U_2$ let $k \ge 0$ be the largest
integer such that $P_k(x) = P_k(y) = P_k(z)$. Then $k = \min\{k(x,z), k(y,z)\}$ and $P_k(x) \supseteq P_{k(x,y)}(x)$,
implying that $\rho(x,y) = \operatorname{diam}_d\left(P_{k(x,y)}(x)\right) \le \operatorname{diam}_d\left(P_k(x)\right) = \max\{\rho(x,z), \rho(y,z)\}$. For distinct
$x, y \in U_1 \cup U_2$, since $x, y \in P_{k(x,y)}(x)$, we have $\rho(x,y) = \operatorname{diam}_d\left(P_{k(x,y)}(x)\right) \ge d(x,y)$, while since
$P_{k(x,y)+1}(x) \neq P_{k(x,y)+1}(y)$, we deduce from (21) that

\[ d(x,y) \ge d\left(P_{k(x,y)+1}(x), P_{k(x,y)+1}(y)\right) \ge \frac{\operatorname{diam}_d\left(P_{k(x,y)}(x)\right)}{D_1 D_2 + 2D_1 + 2D_2 + 2 + \varepsilon} = \frac{\rho(x,y)}{D_1 D_2 + 2D_1 + 2D_2 + 2 + \varepsilon}. \]

The above argument shows that if $U_1, U_2$ are bounded, the metric space $(U_1 \cup U_2, d)$ embeds with
distortion $D_1 D_2 + 2D_1 + 2D_2 + 2 + \varepsilon$ into an ultrametric space for every $\varepsilon \in (0, 1)$.
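The construction of $\rho$ from a refining sequence of partitions can be illustrated on a finite example. The following is a minimal sketch, assuming a hand-picked chain of partitions of four points on the real line (it does not run Lemma 4.1 to produce them); the helper names `block_of`, `diam` and `rho_from_partitions` are hypothetical.

```python
from itertools import permutations

def block_of(partition, x):
    """Return the block of `partition` that contains the point x."""
    for block in partition:
        if x in block:
            return block
    raise ValueError(f"{x!r} is not covered by the partition")

def diam(block, d):
    """Diameter of a finite block with respect to the metric d."""
    return max(d(x, y) for x in block for y in block)

def rho_from_partitions(partitions, d):
    """Build rho(x, y) = diam_d(P_{k(x,y)}(x)), where k(x, y) is the last
    level at which x and y share a block. Assumes partitions[0] is trivial,
    each level refines the previous one, and the last level is singletons."""
    def rho(x, y):
        if x == y:
            return 0
        k = max(i for i, P in enumerate(partitions)
                if block_of(P, x) == block_of(P, y))
        return diam(block_of(partitions[k], x), d)
    return rho

# Hypothetical example data: four points of the real line.
d = lambda x, y: abs(x - y)
points = [0, 1, 4, 5]
partitions = [
    [frozenset({0, 1, 4, 5})],
    [frozenset({0, 1}), frozenset({4, 5})],
    [frozenset({p}) for p in points],
]
rho = rho_from_partitions(partitions, d)

# Strengthened triangle inequality and the lower bound rho >= d.
strong_triangle = all(rho(x, y) <= max(rho(x, z), rho(y, z))
                      for x, y, z in permutations(points, 3))
dominates = all(rho(x, y) >= d(x, y) for x in points for y in points)
```

In this toy chain, points in the same level-one block are at $\rho$-distance 1 and points in different blocks are at $\rho$-distance 5, the diameter of the whole set.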

For possibly unbounded $U_1, U_2 \subseteq X$, fix $x_0 \in X$, and for every $n \in \mathbb{N}$ let $\rho_n$ be an ultrametric
on $(U_1 \cap B_d(x_0, n)) \cup (U_2 \cap B_d(x_0, n))$ satisfying

\[ d(x,y) \le \rho_n(x,y) \le \left( D_1 D_2 + 2D_1 + 2D_2 + 2 + \frac{1}{n} \right) d(x,y) \]

for all $x, y \in (U_1 \cap B_d(x_0, n)) \cup (U_2 \cap B_d(x_0, n))$. Define also $\rho_n(x,y) = 0$ if $\{x, y\}$ is not contained
in $(U_1 \cap B_d(x_0, n)) \cup (U_2 \cap B_d(x_0, n))$. Let $\mathcal{U}$ be a free ultrafilter on $\mathbb{N}$ and set

\[ \rho_\infty(x,y) = \lim_{n \to \mathcal{U}} \rho_n(x,y). \]

Then $\rho_\infty$ is an ultrametric on $U_1 \cup U_2$ satisfying $d \le \rho_\infty \le (D_1 D_2 + 2D_1 + 2D_2 + 2)\, d$. □

Remark 4.2. There are several interesting variants of the problem studied in Theorem 1.4. For
example, answering our initial question, Konstantin Makarychev and Yury Makarychev proved
(private communication) that if $(X, d)$ is a metric space and $E_1, E_2 \subseteq X$ embed into Hilbert
space with distortion $D \ge 1$ then $(E_1 \cup E_2, d)$ embeds into Hilbert space with distortion $f(D)$.
Their argument crucially uses Hilbert space geometry, and therefore the following natural question
remains open: if $(X, d)$ is a metric space and $E_1, E_2 \subseteq X$ embed into $L_1$ with distortion $D \ge 1$, does
it follow that $(E_1 \cup E_2, d)$ embeds into $L_1$ with distortion $f(D)$? Regarding unions of more than two
subsets, perhaps even the following (ambitious) question has a positive answer: if $E_1, \ldots, E_n \subseteq X$
embed into Hilbert space with distortion $D \ge 1$, does it follow that $(E_1 \cup \ldots \cup E_n, d)$ embeds into
Hilbert space with distortion $O(\log n) f(D)$? If true, this statement (in the isometric case $D = 1$)
would yield a very interesting strengthening of Bourgain's embedding theorem [3], which asserts
that any $n$-point metric space embeds into Hilbert space with distortion $O(\log n)$.

Remark 4.3. The following example shows that Theorem 1.4 is sharp up to lower order terms. Fix
two integers $M, N \ge 2$, and write $MN = K(M+N) + L$, where $K \in \mathbb{N}$ and $L \in \{0, 1, \ldots, M+N-1\}$.
Consider the following two subsets of the real line:

\[ U_1 = \bigcup_{i=0}^{K-1} \{ i(M+N),\ i(M+N)+1,\ \ldots,\ i(M+N)+M-1 \}, \]

\[ U_2 = \bigcup_{i=0}^{K-1} \{ i(M+N)+M,\ i(M+N)+M+1,\ \ldots,\ (i+1)(M+N)-1 \}. \]

For $i \in \{1, 2\}$, let $D_i \ge 1$ be the best possible distortion of $U_i$ (with the metric inherited from $\mathbb{R}$)
in an ultrametric space. For $i_1, i_2 \in \{0, \ldots, K-1\}$ and $j_1, j_2 \in \{0, \ldots, M-1\}$ define

\[ \rho_1\big(i_1(M+N)+j_1,\ i_2(M+N)+j_2\big) = \begin{cases} 0 & i_1 = i_2 \wedge j_1 = j_2, \\ M-1 & i_1 = i_2 \wedge j_1 \neq j_2, \\ (K-1)(M+N)+M-1 & i_1 \neq i_2. \end{cases} \]

Then $\rho_1$ is an ultrametric on $U_1$ satisfying $|x-y| \le \rho_1(x,y) \le (M-1)|x-y|$ for all $x, y \in U_1$.
Hence $D_1 \le M-1$. Similarly, for $i_1, i_2 \in \{0, \ldots, K-1\}$ and $j_1, j_2 \in \{M, \ldots, M+N-1\}$ define

\[ \rho_2\big(i_1(M+N)+j_1,\ i_2(M+N)+j_2\big) = \begin{cases} 0 & i_1 = i_2 \wedge j_1 = j_2, \\ N-1 & i_1 = i_2 \wedge j_1 \neq j_2, \\ (K-1)(M+N)+N-1 & i_1 \neq i_2. \end{cases} \]

Then $\rho_2$ is an ultrametric on $U_2$ satisfying $|x-y| \le \rho_2(x,y) \le (N-1)|x-y|$ for all $x, y \in U_2$. Hence
$D_2 \le N-1$. But $U_1 \cup U_2 = \{0, 1, \ldots, K(M+N)-1\}$, and hence any embedding of $U_1 \cup U_2$ into an
ultrametric space incurs distortion at least $K(M+N)-1$ (see for example [12, Lem. 2.4]). Observe
that this lower bound on the distortion equals $MN - L - 1 \ge (M-1)(N-1) - 1 \ge D_1 D_2 - 1$.
When $L = 0$ (e.g., when $M = N = 2S$ or $M = 2N = 6S$ for some $S \in \mathbb{N}$), the above distortion
lower bound becomes $MN - L - 1 \ge (D_1+1)(D_2+1) - 1$. Thus one cannot improve the bound
in Theorem 1.4 to $D_1 D_2$, i.e., additional lower order terms are necessary.
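The example of Remark 4.3 is easy to check by brute force for small parameters. The sketch below is an illustration only, with hypothetical function names: it builds $U_1$, $U_2$, $\rho_1$ and $\rho_2$ for given $M, N$, and verifies the strengthened triangle inequality and the two-sided distortion bounds. It does not verify the lower bound $K(M+N)-1$ for the union, which is taken from the cited lemma.

```python
from itertools import combinations

def example_sets(M, N):
    """Return (K, L, U1, U2) for the construction with MN = K(M+N) + L."""
    K, L = divmod(M * N, M + N)
    U1 = [i * (M + N) + j for i in range(K) for j in range(M)]
    U2 = [i * (M + N) + j for i in range(K) for j in range(M, M + N)]
    return K, L, U1, U2

def make_rho(M, N, K, width):
    """Two-level ultrametric on U1 (width = M) or on U2 (width = N)."""
    def rho(x, y):
        if x == y:
            return 0
        if x // (M + N) == y // (M + N):   # same index i
            return width - 1
        return (K - 1) * (M + N) + width - 1
    return rho

def is_ultrametric(points, rho):
    """Brute-force check of rho(x, y) <= max(rho(x, z), rho(y, z))."""
    return all(rho(x, y) <= max(rho(x, z), rho(y, z))
               for x in points for y in points for z in points)

def distortion_at_most(points, rho, D):
    """Check |x - y| <= rho(x, y) <= D |x - y| on all pairs."""
    return all(abs(x - y) <= rho(x, y) <= D * abs(x - y)
               for x, y in combinations(points, 2))

def verify(M, N):
    K, L, U1, U2 = example_sets(M, N)
    rho1 = make_rho(M, N, K, M)
    rho2 = make_rho(M, N, K, N)
    return (is_ultrametric(U1, rho1) and distortion_at_most(U1, rho1, M - 1)
            and is_ultrametric(U2, rho2) and distortion_at_most(U2, rho2, N - 1)
            and sorted(U1 + U2) == list(range(K * (M + N))))
```

For $M = N = 4$ one has $K = 2$ and $L = 0$, so the union is $\{0, \ldots, 15\}$ and the lower bound $K(M+N) - 1 = 15$ coincides with $MN - 1$, while each of $U_1$, $U_2$ embeds with distortion at most 3.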
Witold Bednorz, Rafal Latala, Gilles Pisier, Gideon Schechtman, Michel Talagrand.